Meta-game Equilibrium for Multi-agent Reinforcement Learning
Authors
Abstract
This paper proposes a multi-agent Q-learning algorithm, meta-game-Q learning, derived from the concept of meta-game equilibrium. Unlike Nash equilibrium, meta-game equilibrium can reach the optimal joint action in a general-sum game by having each agent deliberate over its own preferences and predict the other agents' policies. The meta-game equilibrium is computed by a distributed negotiation algorithm rather than by centralized linear programming. Experiments on the repeated prisoner's dilemma demonstrate empirically that the algorithm converges to a meta-game equilibrium.
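The abstract does not define meta-game equilibrium, so the following is only a rough illustration of the contrast it draws with Nash equilibrium. It assumes a Howard-style two-level metagame, in which agent 1 chooses a reaction to agent 2's action and agent 2 chooses a reaction to agent 1's reaction policy. That construction is an assumption here, not something the abstract confirms. The payoff values and the names `PAYOFF`, `pure_nash`, and `metagame_outcomes` are hypothetical, and the brute-force enumeration stands in for neither the paper's Q-learning update nor its distributed negotiation algorithm.

```python
from itertools import product

C, D = 0, 1  # cooperate, defect
ACTIONS = (C, D)

# Illustrative prisoner's dilemma payoffs (assumed values, not from the paper):
# PAYOFF[(a1, a2)] = (reward to agent 1, reward to agent 2)
PAYOFF = {
    (C, C): (3, 3),
    (C, D): (0, 5),
    (D, C): (5, 0),
    (D, D): (1, 1),
}


def pure_nash(payoff):
    """Joint actions where neither agent gains by deviating alone."""
    result = []
    for a1, a2 in product(ACTIONS, ACTIONS):
        u1, u2 = payoff[(a1, a2)]
        best1 = all(u1 >= payoff[(b, a2)][0] for b in ACTIONS)
        best2 = all(u2 >= payoff[(a1, b)][1] for b in ACTIONS)
        if best1 and best2:
            result.append((a1, a2))
    return result


def metagame_outcomes(payoff):
    """Equilibrium outcomes of a two-level metagame (assumed construction).

    Agent 1 picks a reaction function f: a2 -> a1.
    Agent 2 picks a function g: f -> a2, reacting to agent 1's policy.
    """
    # Each f is a tuple (response to C, response to D).
    reactions = list(product(ACTIONS, repeat=2))
    # Each g holds one action per reaction function in `reactions`.
    meta_policies = list(product(ACTIONS, repeat=len(reactions)))

    def outcome(i, g):
        a2 = g[i]                # agent 2 answers reaction function i
        a1 = reactions[i][a2]    # agent 1 reacts to that action
        return (a1, a2)

    outcomes = set()
    for i, g in product(range(len(reactions)), meta_policies):
        joint = outcome(i, g)
        u1, u2 = payoff[joint]
        # Agent 1 cannot gain by switching to another reaction function.
        stable1 = all(u1 >= payoff[outcome(j, g)][0]
                      for j in range(len(reactions)))
        # Agent 2 cannot gain by answering f with a different action.
        stable2 = all(u2 >= payoff[(reactions[i][b], b)][1] for b in ACTIONS)
        if stable1 and stable2:
            outcomes.add(joint)
    return sorted(outcomes)


print(pure_nash(PAYOFF))          # Nash outcomes of the base game
print(metagame_outcomes(PAYOFF))  # outcomes stable in the metagame
```

Under these assumed payoffs, mutual defection is the only pure Nash equilibrium of the base game, while the metagame enumeration also admits mutual cooperation as a stable outcome. This is the kind of gap between Nash and meta-game equilibrium that the abstract refers to.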
Similar resources
Adapting Reinforcement Learning for Computer Games: Using Group Utility Functions
Group utility functions are an extension of the common team utility function for providing multiple agents with a common reinforcement learning signal for learning cooperative behaviour. In this paper we describe what group utility functions are and suggest using them to provide non-player computer game character behaviours. As yet, reinforcement learning techniques have very rarely been used f...
Spike-based Decision Learning of Nash Equilibria in Two-Player Games
Humans and animals face decision tasks in an uncertain multi-agent environment where an agent's strategy may change in time due to the co-adaptation of others' strategies. The neuronal substrate and the computational algorithms underlying such adaptive decision making, however, are largely unknown. We propose a population coding model of spiking neurons with a policy gradient procedure that succe...
Three Perspectives on Multi-Agent Reinforcement Learning
This chapter concludes three perspectives on multi-agent reinforcement learning (MARL): (1) cooperative MARL, which performs mutual interaction between cooperative agents; (2) equilibrium-based MARL, which focuses on equilibrium solutions among gaming agents; and (3) best-response MARL, which suggests a no-regret policy against other competitive agents. Then the authors present a general framew...
Multi-Agent Evolutionary Game Dynamics and Reinforcement Learning Applied to Online Optimization of Traffic Policy
This chapter demonstrates an application of agent-based selection dynamics to the traffic assignment problem. We introduce an evolutionary dynamic approach that acquires payoff data from multi-agent reinforcement learning to enable an adaptive optimization of traffic assignment, provided that classical theories of traffic user equilibrium pose the problem as one of global optimization. We then s...
Reinforcement Learning for Nash Equilibrium Generation
We propose a new conceptual multi-agent framework which, given a game with an undesirable Nash equilibrium, will almost surely generate a new Nash equilibrium at some predetermined, more desirable pure action profile. The agent(s) targeted for reinforcement learn independently according to a standard model-free algorithm, using internally-generated states corresponding to high-level preference ...